In line with government policies, Taipower aims to be a builder of smart grids. In addition to improving power generation performance, the company is gradually improving the efficiency of power transmission and distribution. Taipower also aims to drive power technology innovation by promoting distribution automation and the construction of loop power supply. Planning and constructing a smart grid further increases the reliability of the power supply. During smart grid construction, in addition to ensuring the security of the transmission and distribution network, it is essential to secure AMI (Advanced Metering Infrastructure) data transmission, which falls within the scope of OT security. Taipower has built the system architecture of the smart grid on international security standards (ISA/IEC 62443, NERC CIP and NIST SP 800-82) to ensure compliance with security assurance levels.
The ISA/IEC 62443 standard defines a series of security requirements and groups them into four security levels. Apart from the top level, which concerns national security, the other three levels are as follows. Security Level 1 (SL1) is designed to prevent accidental or incidental violations. Security Level 2 (SL2) contains the specifications of SL1 and adds 23 extended specifications. Security Level 3 (SL3) includes the specifications of SL2 and adds 30 extended specifications. In addition, IEC 62443-3-3 defines approximately 37 general security requirements and serves as a comprehensive reference for critical infrastructure equipment.
3-1. System Architecture
Based on the ISA/IEC 62443 standard described above, the primary task of the security protection designed for Taipower’s smart grid system architecture is to protect the integrity of the company’s IT/OT assets and to manage risk by developing the company’s cybersecurity policy. The system architecture design follows these steps:
- Develop an Enterprise Security Policy
In the enterprise security policy, all devices connected to the network are reviewed in detail, the network topology of all connected devices is carefully designed, device security configurations are checked regularly, and potential system vulnerabilities are evaluated. The cybersecurity policy affects on-site units, employees and company operations. Before any change to operational procedures, it is necessary to re-examine whether the changed procedures comply with the security policy. Otherwise, the system may be mistakenly regarded as secure while remaining exposed to potential attacks.
- Establish an Independent Network Segment
After the company’s cybersecurity policy is developed, the network can be divided into segments for the corporate headquarters, power plants, electric distribution fields, and project site field areas. Security enhancements are deployed to each segment according to the cybersecurity policy.
- Inter-network Connection Protection
In this step, the connection channels between network segments are adequately protected. This part focuses on protecting data transmission and remote connections. It also strengthens the protection mechanisms for ICS devices to improve their resistance to cyber attacks and to reduce the possibility of device damage in a security incident.
- Monitoring Facilities and System Updates
In addition to detecting potential security breaches by actively monitoring network activity, it is essential to perform regular software updates and hardware replacement to address the issues caused by security vulnerabilities. Security risk persists even when program-controlled equipment functions as designed. For example, if the system uses older program-controlled devices that lack some necessary security features, placing a firewall in front of the PLC can meet the protection requirements.
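The segmentation step can be illustrated with a minimal sketch. The subnets and segment names below are hypothetical and chosen only for illustration; the report does not give Taipower's actual address plan. The sketch maps an address to its segment and flags connections that leave a segment and would therefore need inter-network protection.

```python
import ipaddress
from typing import Optional

# Hypothetical address plan: these subnets are illustrative assumptions only.
SEGMENTS = {
    "corporate_headquarters": ipaddress.ip_network("10.10.0.0/16"),
    "power_plant": ipaddress.ip_network("10.20.0.0/16"),
    "distribution_field": ipaddress.ip_network("10.30.0.0/16"),
    "project_site": ipaddress.ip_network("10.40.0.0/16"),
}


def segment_of(address: str) -> Optional[str]:
    """Return the name of the segment that contains the address, or None."""
    ip = ipaddress.ip_address(address)
    for name, network in SEGMENTS.items():
        if ip in network:
            return name
    return None


def crosses_segments(src: str, dst: str) -> bool:
    """True when a connection leaves its segment (unknown addresses count as crossing)."""
    src_segment, dst_segment = segment_of(src), segment_of(dst)
    return src_segment is None or dst_segment is None or src_segment != dst_segment
```

A connection for which `crosses_segments` returns `True` is one that, under this sketch, would have to pass through a protected inter-network channel.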
Taipower’s smart grid cybersecurity architecture includes five parts:
The overall information system: in the enterprise’s internal IT environment, all devices are connected to the Internet through the firewall. This area includes employee computers, databases and devices such as printers and Wi-Fi access points; Virtual Private Network (VPN) connections are also protected by the firewall.
Operation management system: for the SCADA/HMI devices, two sets of firewalls are used for physical network segmentation. The network is designed as an independent physical segment because operations need to read data from the OT devices and then transmit it to the IT/OT DMZ. The SCADA/HMI can therefore work with OT devices in different fields at the same time, manage the data centrally, and then backhaul it to the IT/OT zone.
On-site real-time monitoring: the firewall provides overall security protection. The sensors, PLC control units, computer rooms and engineering areas are all separated by the firewall and its virtual network segments.
Security Operation Center (SOC): in addition to monitoring each device and service, the SOC releases security rules that immediately update the protection functions of each endpoint.
In addition to the SOC, Taipower has integrated a Managed Detection and Response (MDR) service. MDR lets an enterprise handle information security incidents in a timely and effective manner through an outsourced team of information security professionals, so that the impact of an incident does not spread. Whereas EDR detects malicious program intrusions and identifies and responds to possible threats on endpoint devices (such as PCs and laptops), MDR is a collaborative service for threat detection and response: when an endpoint is found to be compromised, the professional security team works with the company to provide an effective solution, with the aim of containing and eliminating the threat.
TPC-Information Sharing and Analysis Center (TPC-ISAC): Taipower has founded an ISAC within the company to exchange security incident information with the SOC. TPC-ISAC also reports and updates the corresponding event information to the higher-level agency, N-ISAC.
Figure 1 The TPC-ISAC Architecture
For the IT/OT cybersecurity of the smart grid, there are three lines of cybersecurity defense and an information sharing mechanism.
The first line of defense: firewall or unidirectional gateway (data diode)
- Because the OT network may be connected to the Internet, industrial-grade firewall equipment is the first line of network segmentation and security protection.
- Segmentation keeps malware such as ransomware from infecting the OT network and spreading to other OT fields.
- Unidirectional data transmission: data is transferred one way, from the OT network to the IT network, to keep out high-risk Internet packets and malware attacks.
The second line of defense: IDS
- Even with network segmentation, an attacker may still penetrate the network and reach the OT systems inside. A passive intrusion detection system (IDS) therefore serves as the second line of defense.
- To evaluate the feasibility of an IDS in OT network operation, Taipower has set up a smart grid network demonstration site in Kinmen County. The goal is for staff to notice abnormal conditions immediately when the power plant or the IEC 61850 automated substation deviates from baseline operation, when unauthorized equipment is plugged in, or when malicious network packets or data are detected, so that they can verify and resolve events faster. The complete system is expected to be in place by mid-2020, at which point the SOC will monitor it as well.
- Beyond the Kinmen County demonstration site in 2020, Taipower will extend OT IDS in stages to a power plant (Taichung), a power substation department (Taichung) and a power distribution department (Yulin). The deployment provides information assets, network topology and the Purdue model to the OT maintenance units so that they can establish baseline operation, and the SOC will begin monitoring these sites by the end of 2020.
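The baseline idea behind the OT IDS can be sketched as a comparison between an approved asset inventory and the devices actually observed on the network. This is a simplified illustration, not the detection logic of the deployed IDS; the `Device` record and the sample addresses are assumptions.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class Device:
    """Hypothetical asset record: a MAC address and the IP it is expected to use."""
    mac: str
    ip: str


def find_unauthorized(baseline: Set[Device], observed: Set[Device]) -> Set[Device]:
    """Devices seen on the network that are not in the approved baseline."""
    return observed - baseline


def find_missing(baseline: Set[Device], observed: Set[Device]) -> Set[Device]:
    """Baseline devices that were not observed, which may indicate an outage."""
    return baseline - observed
```

Because a `Device` combines MAC and IP, a known MAC address that appears with an unexpected IP is also reported as a deviation from the baseline.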
Figure 1-2 Intrusion Detection Mechanism (IDS) as the second line of defense
The third line of defense: application white-list
- It is applied because OT equipment is difficult to keep up to date: virus pattern updates or service patches may affect system stability.
- The main idea is to maintain a white-list that allows only designated applications or service components, such as those on new control workstations, HMIs or program-controlled equipment, to operate. It works as the last line of defense and protects the system from implanted malware or viruses that could damage system availability.
- In November 2019, we completed the research project on the Application Whitelist Mechanism. Its functional requirements will be applied to smart-grid-related cybersecurity systems with reference to the project report.
- Taipower’s Department of Power System Operations plans to upgrade the EMS (Energy Management System) by applying the functional requirements of the Application Whitelist Mechanism. These requirements satisfy the regulations on network firewalls, network segmentation and IDS. In the EMS cybersecurity meeting, we shared the requirements with the consulting company that is drafting the EMS specifications, and it agreed to adopt them to strengthen the cybersecurity of the new EMS. We will continue to stress the importance of these requirements throughout the new EMS construction schedule.
Figure 1-3 EMS (Energy Management System) function requirements
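As a rough illustration of the application white-list concept, the sketch below approves executables by their SHA-256 digest and denies everything else by default. It is an assumption-based example and does not represent the functional requirements in the project report; the class and method names are hypothetical.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of a binary image."""
    return hashlib.sha256(data).hexdigest()


class ApplicationWhitelist:
    """Hypothetical default-deny list keyed on file digests."""

    def __init__(self) -> None:
        self._approved = set()

    def approve(self, binary: bytes) -> None:
        """Record the digest of a vetted application or service component."""
        self._approved.add(sha256_of(binary))

    def may_execute(self, binary: bytes) -> bool:
        """Allow execution only when the exact binary has been approved."""
        return sha256_of(binary) in self._approved
```

Keying on the digest rather than the file name means that a modified or implanted binary is rejected even if it reuses the name of an approved program.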
Cybersecurity Information Sharing
- The cybersecurity information sharing and analysis mechanism is based on the characteristics of the smart grid: cross-regional coverage, a mixed (wired/wireless) network architecture, and year-round, around-the-clock operation. External information sharing and internal real-time monitoring together form a joint defense that strengthens cybersecurity.
3-2. Overview of Key Function
In addition to the ISA/IEC 62443 standard, the company also takes into account the ICS security controls proposed in NIST SP 800-82, and it enhances network and environmental protection based on Taiwan’s cybersecurity protection benchmark. ICS includes SCADA, DCS, PLC, Programmable Automation Controller (PAC), HMI and Instrumentation & Control (I&C). Taipower’s smart grid security protection specifications are divided into 11 categories: “ICS Network Architecture”, “Access Control”, “Audit and Accountability”, “Contingency Planning”, “Identification and Authentication”, “System and Communications Protection”, “System and Services Acquisition”, “Physical and Environmental Protection”, “System and Information Integrity”, “Configuration Management” and “Organization Management”. They emphasize the integrity of ICS-related systems and the availability of data.
1. ICS Network Architecture
The industrial control system network is most vulnerable at the boundary between the ICS network and the IT network. Because industrial control systems differ from IT networks, the network architecture should be planned, and boundary protection strengthened, according to the characteristics of the ICS.
- Network Segregation
Due to business operation requirements, ICS data must be accessed and monitored over the Internet and through IT devices. Once the two heterogeneous systems are connected, each increases the security threats and hidden dangers faced by the other unless an appropriate security protection mechanism is in place. For this reason, the company has set up firewalls that separate the different network segments.
- Demilitarized Zone (DMZ)
The DMZ is set up between the internal IT network and the ICS control network. Data in this zone, such as a historical database, must be available to both network segments. The design keeps equipment in the two segments from accessing each other directly, which reduces the risk of the industrial control systems being attacked.
- Firewall Requirement
The firewall must be able to filter two or more types of network traffic, such as HTTP and Modbus, and must be able to control connections among the ICS control network, the DMZ and the internal network.
- Boundary Protection
The boundary protection device controls and filters the traffic of the ICS control network.
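The DMZ and firewall requirements can be expressed as a default-deny policy over zones and protocols. The rule set below is a hypothetical sketch: the zone names and the choice of which protocol is permitted on which path are assumptions for illustration, not Taipower's actual firewall configuration.

```python
# Hypothetical conduits, written as (source zone, destination zone, protocol).
# Both sides may reach the DMZ, but there is no direct IT <-> ICS path.
ALLOWED_CONDUITS = {
    ("ICS", "DMZ", "modbus"),  # assumed: control network publishes data to the DMZ
    ("IT", "DMZ", "http"),     # assumed: IT users read shared data from the DMZ
}


def is_allowed(src_zone: str, dst_zone: str, protocol: str) -> bool:
    """Default deny: a connection passes only if it matches an explicit conduit."""
    return (src_zone, dst_zone, protocol.lower()) in ALLOWED_CONDUITS
```

Under these assumed rules, `is_allowed("IT", "ICS", "modbus")` is `False`, which reflects the design goal that equipment in the two segments cannot reach each other directly.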
2. Access Control
In the ICS field, common security weaknesses include low password complexity, remote connections used for maintenance or control management, and excessive system authorization. As a result, user account management, restrictions on remote access, access control for wireless network facilities, and prevention of unauthorized access are the primary concerns. This category mainly covers the following items.
- Account Management
- Remote Access
- Least Privilege
- Wireless Management
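Two of the weaknesses named above, excessive authorization and low password complexity, can be illustrated with a short sketch. The roles, actions and the 12-character threshold are hypothetical values chosen for the example, not requirements taken from the specification.

```python
import string

# Hypothetical role-to-permission mapping; each role gets only what it needs.
ROLE_PERMISSIONS = {
    "operator": {"read_telemetry", "acknowledge_alarm"},
    "engineer": {"read_telemetry", "acknowledge_alarm", "change_setpoint"},
    "auditor": {"read_audit_log"},
}


def is_permitted(role: str, action: str) -> bool:
    """Least privilege: unknown roles and unlisted actions are denied."""
    return action in ROLE_PERMISSIONS.get(role, set())


def meets_password_policy(password: str, min_length: int = 12) -> bool:
    """Assumed policy: minimum length plus lower, upper, digit and symbol."""
    return (
        len(password) >= min_length
        and any(c.islower() for c in password)
        and any(c.isupper() for c in password)
        and any(c.isdigit() for c in password)
        and any(c in string.punctuation for c in password)
    )
```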
3. Audit and Accountability
When an industrial control system has a security incident, missing details can make it nearly impossible to handle, and similar incidents may then recur. For this reason, relevant recommendations and information are collected for audit. Most industrial control systems do not support auditing functions or auditing tools. Systems that are subject to accountability requirements, however, must provide audit events, audit record content, audit storage capacity, responses to audit processing failures, and time stamps to meet the audit requirements. This category mainly covers the following items:
- Audit Events
- Content of Audit Records
- Audit Storage Capacity
- Response to Audit Processing Failures
- Time Stamps
- Protection of Audit Information
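One possible way to combine audit record content, time stamps and protection of audit information is a hash-chained log, sketched below. This is an illustrative technique chosen for the example, not a mechanism described in the specification; the field names are assumptions.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS = "0" * 64  # placeholder hash that precedes the first record


def _digest(record: dict) -> str:
    """SHA-256 over every field except the stored hash itself."""
    body = {k: v for k, v in record.items() if k != "hash"}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_record(log: list, event: str, actor: str) -> dict:
    """Append a UTC-timestamped record linked to the previous record's hash."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event": event,
        "actor": actor,
        "prev_hash": log[-1]["hash"] if log else GENESIS,
    }
    record["hash"] = _digest(record)
    log.append(record)
    return record


def verify_log(log: list) -> bool:
    """Return False if any record was altered or the chain was broken."""
    prev = GENESIS
    for record in log:
        if record["prev_hash"] != prev or record["hash"] != _digest(record):
            return False
        prev = record["hash"]
    return True
```

Because each record commits to the one before it, editing or removing an earlier entry invalidates the stored hashes that follow, which makes after-the-fact tampering detectable.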
4. Contingency Planning
Industrial control systems have physical components installed at fixed locations. When a component suddenly fails, the service it provides can be interrupted if no backup or replacement is immediately available. In that case, the organization should initiate its emergency response plan to restore operation. This category mainly covers the following items.
- Contingency Plan
- Safe Mode
- Control system backup
- ICS-Specific Characteristics
5. Identification and Authentication
Authentication matters because industrial control systems often have security issues such as unauthorized user access or multiple people sharing a group of system accounts. When establishing identification and the corresponding protection measures, it is necessary to consider whether a countermeasure affects system performance and to adjust the security solution to the target environment. The scope of identification and authentication covers users, devices and authenticator feedback. This category mainly covers the following items.
- Organizational Users Identification and Authentication
- Device Identification and Authentication
- Authenticator Management
- Authenticator Feedback
6. System and Communications Protection
ICS environments have common vulnerabilities such as plain-text transmission, missing integrity checks and missing configuration backups. Communication protection must ensure the confidentiality and integrity of data in transmission and in storage while taking system performance and availability into account, and it should be adjusted to the target environment. This category mainly covers the following items.
- Transmission Confidentiality and Integrity
- Protection of Data storage
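The integrity half of this requirement can be sketched with a keyed message authentication code from the Python standard library. The example covers integrity and authenticity only; confidentiality would additionally require encryption. The key and message contents are placeholders.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag to send alongside the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign(key, message), tag)
```

A receiver that shares the key can call `verify_tag` on each message and discard any message whose tag does not match, which detects modification in transit.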
7. System and Services Acquisition
Cyber threats may arise when external service providers deliver incomplete services. The scope of cyber defense includes external system services and the corresponding documentation.
- External System Services
- System Documentation
8. Physical and Environmental Protection
Common physical security threats to ICS include the lack of backup power, the lack of temperature and humidity control, and uncontrolled physical access by people. This category therefore provides recommendations on physical access authorization, physical access control, physical access monitoring, emergency power, temperature and humidity control, water damage protection, and access by third parties or escorted visitors. The main focuses are listed below.
- Physical Access Authorizations
- Physical Access Control
- Monitoring Physical Access
- Emergency Power
- Temperature and Humidity Control
9. System and Information Integrity
ICS often lack security countermeasures such as malware protection, bug fixes, updates and system monitoring. Consequently, an ICS is exposed to cyber threats when it connects to the Internet. Flaw remediation, malware protection, system monitoring and protection against predictable faults are therefore proposed to improve system and information integrity. The main focuses are:
- Flaw Remediation
- Malicious Code Protection
- System Monitoring
- Fault Tolerance
10. Configuration Management
Common security vulnerabilities include inadequate control of configuration changes and of system administration authority. The main items are listed below.
- Configuration Change Control
- Least Functionality
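Both items can be illustrated by comparing a device's running state with its approved state. The sketch below is a simplified assumption of how such a check might look; the setting names and services are hypothetical.

```python
ABSENT = "<absent>"  # marker for a setting that exists on only one side


def config_drift(approved: dict, running: dict) -> dict:
    """Map each differing setting to an (approved value, running value) pair."""
    drift = {}
    for key in approved.keys() | running.keys():
        expected = approved.get(key, ABSENT)
        actual = running.get(key, ABSENT)
        if expected != actual:
            drift[key] = (expected, actual)
    return drift


def unneeded_services(enabled: set, required: set) -> set:
    """Least functionality: services that are enabled but not required."""
    return enabled - required
```

Any non-empty result from `config_drift` would indicate a change that did not go through the approved configuration, and any result from `unneeded_services` a candidate for disabling.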
11. Organization Management
Common organizational management weaknesses in the ICS domain include lack of security-awareness training, inadequate management plans, and incomplete security procedures. This control classification provides recommendations for outsourcing management, employee management, risk management and incident response at the business management level. Its main focuses are the following items.
- Outsourcing Management
- Personnel Management
- Risk Management
- Incident Response
3-3 Cyber Threat Monitoring and Analysis (Security Operation Center, SOC)
3-3-1 Functional Architecture
At present, Taipower’s SOC collects logs from each security device through the front-end log collector (FSA) and then sends them to the Security Information and Event Management (SIEM) system for real-time monitoring.
Figure 2 The monitoring architecture schematic
3-3-2 Monitor Scope
The monitored device types mainly include firewalls, WAFs, intrusion detection systems (IDS), various servers, routers, and so on. The list of quantities is as follows:
3-3-3 Rules Design
Rule development is based on how the front-end devices judge the behavior of attackers. In addition to accurate, in-depth analysis of the results from a single device, for subtle or high-risk attack patterns the SOC uses active learning with automatic analysis mechanisms to correlate logs from multiple devices and discover hidden threats. It also provides a 24x7 active detection mechanism and alert service, which helps block malicious connections at the earliest moment to minimize possible damage.
Event type | Description | Detection source |
---|---|---|
DDoS attack | A large number of abnormal connections from outside to inside, which may cause service interruption | IDS/IPS, WAF, Firewall, DDoS |
Virus/Worm | Infection by a virus or worm that cannot be removed | IDS/IPS, Antivirus, APT, EDR |
Abnormal network behavior | Unreasonable network connection behavior, which may be malicious or the result of misconfiguration | IDS/IPS, WAF, Firewall, APT, Proxy, EDR |
Intrusion attack | The contents of a packet are identified as malicious | IDS/IPS, WAF, APT |
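The idea of correlating logs from several detection sources can be sketched with a simple rule: flag a source address when at least two different device types raise alerts about it within a short window. This is an illustrative stand-in, not the SOC's active-learning mechanism; the window length, threshold and alert tuple layout are assumptions.

```python
from collections import defaultdict


def correlate(alerts, window_seconds=300, min_sources=2):
    """Return source IPs reported by several device types within one window.

    alerts: iterable of (timestamp_seconds, device_type, src_ip) tuples.
    """
    by_ip = defaultdict(list)
    for timestamp, device_type, src_ip in alerts:
        by_ip[src_ip].append((timestamp, device_type))

    flagged = set()
    for src_ip, items in by_ip.items():
        items.sort()
        for i, (start, _) in enumerate(items):
            # Distinct device types that alerted within the window from `start`.
            devices = {d for t, d in items[i:] if t - start <= window_seconds}
            if len(devices) >= min_sources:
                flagged.add(src_ip)
                break
    return flagged
```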
3-3-4 Concept of Rules
Rules are designed according to the attack methodology, the level of impact, and the type of security device. The concepts used to identify each attack are described as follows.
- Denial-of-Service attack: a large number of continuous external connections to internal hosts in a short period. In addition, connections to the same port on different hosts can be identified as Port Sweep behavior, and connections to different ports on the same host as Port Scan behavior. Rules developed on the SIEM platform leverage the front-end security devices to detect this kind of attack.
- Virus/Worm: this type of attack is usually launched from malicious neighbors inside the network, which use specific NetBIOS protocols to transmit malware. The rule therefore focuses on those specific transmission ports and correlates the alerts from anti-virus or APT (Advanced Persistent Threat) tools to discover the attack.
- Abnormal network behavior: not all unusual network connection behavior is malicious. Some of it may be caused by users violating the corporate security policy or by system misconfiguration. These rules should be designed more carefully and take various factors into account; for example, streaming audio may be judged differently in two units with different network policies.
- Intrusion attack: compared with abnormal network behavior, the alert is usually generated by security devices rather than by end users. The front-end security devices examine network packets and identify malicious connections, such as attempts to exploit system vulnerabilities or misconfiguration. The rationale of the rule is clear, and the alert is generated with details such as the event name and CVE number.
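The distinction between Port Scan and Port Sweep drawn above can be sketched as follows. The threshold of 10 and the connection tuple layout are assumptions for illustration; a production SIEM rule would also bound the time window.

```python
from collections import defaultdict


def classify_probes(connections, threshold=10):
    """Label port scans and port sweeps in a batch of connection records.

    connections: iterable of (src_ip, dst_ip, dst_port) tuples.
    """
    ports_per_host = defaultdict(set)  # (src, dst_ip) -> ports probed on that host
    hosts_per_port = defaultdict(set)  # (src, dst_port) -> hosts probed on that port
    for src, dst, port in connections:
        ports_per_host[(src, dst)].add(port)
        hosts_per_port[(src, port)].add(dst)

    findings = []
    for (src, dst), ports in ports_per_host.items():
        if len(ports) >= threshold:  # many ports on one host
            findings.append(("port_scan", src, dst))
    for (src, port), hosts in hosts_per_port.items():
        if len(hosts) >= threshold:  # one port across many hosts
            findings.append(("port_sweep", src, port))
    return findings
```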
A real-time rule is one that has been designed, tested and debugged before it goes live. However, intrusion methods are sophisticated, and zero-day attacks are always emerging on the Internet. The SOC therefore has to follow the latest announcements and continuously fine-tune the real-time rules. All developed rules serve as foundational building blocks that can be used to design the next generation of monitoring rules. Meanwhile, the SOC continuously collects information on security incidents and profiles their attack methods to broaden its defense coverage. Optimizing existing rules is also essential to increase accuracy and avoid excessive false alarms.